To help reveal the diversity and evolution of bat coronaviruses we collected 1067 bats from 21 species in China. 
A total of 73 coronaviruses (32 alphacoronaviruses and 41 betacoronaviruses) were identified in these bats, with 
an overall prevalence of 6.84%. All newly-identified betacoronaviruses were SARS-related Rhinolophus bat 
coronaviruses (SARSr-Rh-BatCoV). Importantly, with the exception of the S gene, the genome sequences of the 
SARSr-Rh-BatCoVs sampled in Guizhou province were closely related to SARS-related human coronavirus. 
Additionally, the newly-identified alphacoronaviruses exhibited high genetic diversity and some may represent 
novel species. Our phylogenetic analyses also provided insights into the transmission of these viruses among bat 
species, revealing a general clustering by geographic location rather than by bat species. Inter-species 
transmission among bats from the same genus was also commonplace in both the alphacoronaviruses and 
betacoronaviruses. Overall, these data suggest that high contact rates among specific bat species enable the 
acquisition and spread of coronaviruses. 


1. Introduction 

Coronaviruses (CoVs; family Coronaviridae) are enveloped, 
positive-sense, single-stranded RNA viruses with the largest genomes 
(25-31 kb) among known RNA viruses (de Groot et al., 2011). Based on 
genome-scale phylogenies the known CoVs are classified into 30 
species within four genera: Alphacoronavirus, Betacoronavirus, 
Gammacoronavirus, and Deltacoronavirus (ICTV, 2017). 
Coronaviruses can infect humans, other mammals, and birds, causing 
respiratory, enteric, hepatic, and neurological diseases of varying 
severity (Masters and Perlman, 2013). Coronaviruses are well known 
globally due to the emergence of severe acute respiratory syndrome 
(SARS) during 2002-2003, caused by a previously unknown CoV 
(Ksiazek et al., 2003; Peiris et al., 2003). Subsequently, two other 
human CoVs (NL63 and HKU1) causing respiratory disease were 
identified (van der Hoek et al., 2004; Woo et al., 2005). Strikingly, 
the Middle East respiratory syndrome (MERS), which emerged in 2012 
and is characterized by a higher mortality than SARS, was also caused 
by a previously unknown CoV (Bermingham et al., 2012; Zaki et al., 2012). 
The ongoing emergence of these CoVs in humans means that CoVs will 
likely remain a key public health threat for the foreseeable future. 

Since the discovery of SARS-CoV in Himalayan palm civets (Guan 
et al., 2003), intense effort has been directed toward identifying and 
characterizing coronaviruses from animals globally. Consequently, a 
number of CoVs have been identified in a diverse range of vertebrates 
including domestic and wild mammals, and birds (Poon et al., 2005; 
Wang et al., 2015; Woo et al., 2012). Bats (order Chiroptera), with 
more than 1240 species, have remarkable species diversity, and 
comprise more than 20% of living mammalian species (Nowak, 
1994). The discovery of SARS-related CoV in Rhinolophus horseshoe 
bats in China in 2005 (Lau et al., 2005; Li et al., 2005) attracted 
global attention to these mammals, such that diverse Alpha- and 
Beta-CoVs have now been identified in a variety of bats globally over the past 
decade (Corman et al., 2013, 2014, 2015; Drexler et al., 2014; He et al., 
2014; Huang et al., 2016; Smith et al., 2016; Woo et al., 2012). More 
importantly, due to the close relationship between CoVs in bats and 
those causing human infections, it is believed that bats are the original 
source of human CoVs including SARS-CoV and MERS-CoV (Corman 
et al., 2015; Ge et al., 2013; Huynh et al., 2012; Ithete et al., 2013; Tao 
et al., 2017). Due to their high diversity and biological and ecological 
characteristics that potentially facilitate virus maintenance and 
transmission, bats likely harbor a large number of viruses, some of which 
may then jump to other species (Balboni et al., 2012). Hence, more 
effort is needed to identify and characterize the currently unrecognized 
CoVs that circulate in bats. 

At least 120 bat species are found in China, mainly distributed in 
the eastern, central, and southern regions of that country (Zhang et al., 
1997). Herein we report novel and diverse CoVs and SARS-related 
CoVs identified in Rhinolophus, Miniopterus, Murina and Myotis spp. 
bats sampled from several geographic regions of China. In addition, we 
inferred their genomic characteristics and evolutionary relationships 
with known viruses and their hosts. 


2. Results 

2.1. Bats collected and prevalence of CoVs 

During 2012-2015 a total of 1067 bats were collected from five 
caves in five counties from Guizhou, Henan, and Zhejiang provinces 
(Fig. 1, Table 1 and S1). After morphological examination and 
sequence analysis of the mitochondrial cytochrome b (mt-cyt b) gene, 
these bats were assigned to 21 species. The species and their 
abundance varied among regions, with Miniopterus schreibersii 
(47%) in Guizhou, Rhinolophus ferrumequinum (45%) and R. pusillus 
(32%) in Henan, and R. monoceros (41%) and R. sinicus (50%) in 
Zhejiang as the predominant species. Notably, only R. pusillus bats 
were found in all five regions. 

Using RT-PCR targeting a conserved fragment of the RdRp 
(RNA-dependent RNA polymerase) gene of CoV as described previously (Lau 
et al., 2005; Wang et al., 2015), viral RNA was identified in a total of 73 
bat fecal samples, with an overall detection rate of 6.84% (Table 1). 
Phylogenetic analysis revealed that all these viral sequences were 
closely related to coronaviruses. Among the predominant bat 
species, CoV prevalence was high in R. monoceros (28/119, 23.53%) 
and M. schreibersii (16/198, 8.08%), but was lower in R. sinicus 
(8/219, 3.65%), R. ferrumequinum (2/183, 1.09%) and R. pusillus 
(1/154, 0.64%). 

Of the 73 newly-identified CoVs, 32 were alphacoronaviruses 
and 41 were betacoronaviruses. To better characterize these 
newly-identified bat CoVs, the complete viral RdRp gene sequence was 
obtained from 65 (89%) of the viral RNA positive bat samples. In 
addition, 5 complete and 4 near-complete viral genome sequences were 
successfully obtained from CoV positive samples. 

2.2. Newly-identified SARS-related Rhinolophus bat coronaviruses in 
bats 

Genetic analysis of the conserved domains in the replicase 
polyprotein pp1ab - ADP-ribose 1''-phosphatase (ADRP), nsp5 (3CLpro), 
nsp12 (RdRp), nsp13 (Hel), nsp14 (ExoN), nsp15 (NendoU) and nsp16 
(O-MT) - revealed that the newly-identified bat betacoronaviruses shared 
more than 90% amino acid sequence identity with Severe acute 
respiratory syndrome-related coronavirus (SARSr-CoV) (Table S2) 
and clustered together in a phylogenetic analysis (Fig. 2). Hence, these 
data suggest that these bat viruses belong to the species SARSr-CoV 
according to the criteria for species demarcation in the subfamily 



Fig. 1. A map of China illustrating the locations of the trap sites (red circles) at which bats were captured. 
Table 1 

Prevalence of coronaviruses in the bats collected during 2012-2015 in Guizhou, Henan and Zhejiang provinces, China. 

| Species | Guizhou: Anlong | Henan: Neixiang | Henan: Lushi | Henan: Jiyuan | Zhejiang: Longquan | Total |
| Aselliscus stoliczkanus | 0/2 | - | - | - | - | 0/2 |
| Barbastella beijingensis | - | - | - | 0/2 | - | 0/2 |
| Hipposideros armiger | 1/35 | - | - | - | - | 1/35 |
| Hypsugo savii | - | - | - | 0/1 | - | 0/1 |
| Miniopterus ricketti | - | - | 0/1 | - | - | 0/1 |
| Miniopterus schreibersii | 14/179 | 2/19 | - | - | - | 16/198 |
| Murina leucogaster | - | 3/21 | 2/19 | 0/2 | - | 5/42 |
| Murina sp. | - | - | - | 0/3 | - | 0/3 |
| Myotis davidii | 3/5 | 2/4 | 0/1 | 0/1 | - | 5/11 |
| Myotis siligorensis | 1/4 | - | - | - | - | 1/4 |
| Plecotus auritus | - | - | - | 0/4 | - | 0/4 |
| Rhinolophus affinis | - | 0/3 | - | - | - | 0/3 |
| Rhinolophus ferrumequinum | - | 0/8 | 0/18 | 2/157 | - | 2/183 |
| Rhinolophus luctus | - | - | 0/2 | - | 0/1 | 0/3 |
| Rhinolophus macrotis | 1/1 | - | - | - | - | 1/1 |
| Rhinolophus pearsonii | 2/30 | 0/7 | - | - | 1/21 | 3/58 |
| Rhinolophus pusillus | 1/22 | 0/45 | 0/57 | 0/29 | 0/1 | 1/154 |
| Rhinolophus rex | 1/23 | - | - | - | - | 1/23 |
| Rhinolophus sinicus | 5/83 | - | - | - | 3/136 | 8/219 |
| Rhinolophus thomasi | - | - | - | - | 1/1 | 1/1 |
| Rhinolophus monoceros | 0/1 | 0/4 | 0/2 | - | 28/112 | 28/119 |
| Total | 29/385 | 7/111 | 2/100 | 2/199 | 33/272 | 73/1067 |

Note: values are CoV-positive/tested bats; -, no animals were captured. 
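
The detection rates quoted in the text follow directly from the counts in Table 1. As a minimal sketch (the function and variable names are illustrative and not part of the study), the per-site totals can be summed and converted to percentages as follows:

```python
# Positive/tested counts per sampling site, copied from the Total row of Table 1.
SITE_TOTALS = {
    "Anlong (Guizhou)": (29, 385),
    "Neixiang (Henan)": (7, 111),
    "Lushi (Henan)": (2, 100),
    "Jiyuan (Henan)": (2, 199),
    "Longquan (Zhejiang)": (33, 272),
}


def prevalence_percent(positive, tested):
    """Return the detection rate as a percentage rounded to two decimals."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return round(100 * positive / tested, 2)


def overall(counts):
    """Sum positive and tested animals over all sites."""
    positive = sum(p for p, _ in counts.values())
    tested = sum(t for _, t in counts.values())
    return positive, tested


if __name__ == "__main__":
    for site, (pos, n) in SITE_TOTALS.items():
        print(f"{site}: {pos}/{n} = {prevalence_percent(pos, n)}%")
    pos, n = overall(SITE_TOTALS)
    print(f"Overall: {pos}/{n} = {prevalence_percent(pos, n)}%")
```

Summing the site totals in this way is also a quick consistency check on the table itself, since the column totals should add up to the overall 73/1067.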


Coronavirinae defined by the International Committee on Taxonomy of 
Viruses (ICTV) (de Groot et al., 2011). As these viruses were from 
Rhinolophus bats, we designated them as SARS-related 
Rhinolophus bat coronaviruses (SARSr-Rh-BatCoV). Among these, 33 
SARSr-Rh-BatCoVs were identified in 28 R. monoceros, 1 R. pearsonii, 
3 R. sinicus and 1 R. thomasi sampled from the city of Longquan, 
Zhejiang province. Similarly, 2 SARS-related CoVs were identified in R. 
ferrumequinum sampled from the city of Jiyuan, Henan province, 
while the remaining 6 viruses were identified in 5 R. sinicus and 1 R. 
rex sampled from the county of Anlong, Guizhou province. 
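
The species assignment above rests on a threshold rule: more than 90% amino acid identity in each of the seven conserved replicase domains. A minimal sketch of that rule is given below; the function name and the example identity values are hypothetical and serve only to illustrate the check, not to reproduce Table S2.

```python
# The seven conserved replicase domains named in the text.
REPLICASE_DOMAINS = ("ADRP", "3CLpro", "RdRp", "Hel", "ExoN", "NendoU", "O-MT")


def same_species(identities, threshold=90.0):
    """Return True if amino acid identity exceeds the threshold in every domain.

    `identities` maps each domain name to a percent identity against a
    reference member of the species.
    """
    missing = [d for d in REPLICASE_DOMAINS if d not in identities]
    if missing:
        raise KeyError(f"missing domains: {missing}")
    return all(identities[d] > threshold for d in REPLICASE_DOMAINS)


# Hypothetical identity values (percent), invented for illustration only.
example_within = {d: 97.5 for d in REPLICASE_DOMAINS}
example_divergent = {**example_within, "O-MT": 88.0}

if __name__ == "__main__":
    print(same_species(example_within))
    print(same_species(example_divergent))
```

Requiring every domain to pass, rather than an average, reflects the way the criterion is applied in the text; a single domain below the threshold is enough to fail the check.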

On the RdRp phylogeny (Fig. 2) all known SARSr-Rh-BatCoVs from 
China could be divided into four clusters, within which the 
newly-identified viruses fell into three clusters that reflect their geographic 
origins. Specifically: (i) the viruses identified in Rhinolophus bats 
sampled in Zhejiang province (denoted Rhinolophus bat Longquan-) 
were closely related to each other and clustered with Rhinolophus bat 
CoV HKU3 sampled from R. sinicus in Hong Kong (Lau et al., 2005); 
(ii) the viruses identified in R. ferrumequinum from Jiyuan in Henan 
province (Jiyuan-84 and Jiyuan-331) formed a cluster with those 
viruses identified in R. ferrumequinum from other regions of China. 
The viruses within the cluster were from central China (Henan, Hubei 
and Shaanxi provinces), with the exception of the lineage comprising 

[Fig. 2 panels: RdRp-nt; S-nt] 

Fig. 2. Phylogenetic analysis of the nucleotide sequences of the RdRp and S genes including those CoVs obtained here. Bootstrap values (> 70%) are shown at relevant nodes. The trees 
were mid-point rooted for clarity only. The scale bar depicts the number of nucleotide substitutions per site. 
JTMC15 and BtRf-BetaCoV/JL2012 identified in R. ferrumequinum 
sampled in northeastern China (Jilin province); (iii) the viruses 
identified in R. sinicus sampled from Anlong in Guizhou province 
clustered with those identified in Rhinolophus bats from southwestern 
China including Guangxi, Guizhou and Yunnan provinces (Ge et al., 
2013; Li et al., 2005). Strikingly, only the SARSr-Rh-BatCoVs from 
southwestern China exhibited a close evolutionary relationship with 
SARS-related human coronavirus (SARS-CoV) and SARS-related palm 
civet coronavirus (SARSr-CiCoV) (Tor2 and SZ3), suggesting that 
SARS-CoV may have originated in this region. Finally, within each of 
these three clusters the SARS-related coronaviruses clustered 
according to their geographic origins. 

Unlike the RdRp gene tree, all SARSr-Rh-BatCoVs and SARS-CoVs 
from China fell into two distant clades on the S gene phylogeny (Fig. 2). 
The first included the SARS-CoVs and two bat SARSr-Rh-BatCoVs 
(Rs3367 and RsSHC014), while the second comprised all the 
remaining bat SARSr-Rh-BatCoVs including the newly-identified CoVs, which 
could be further sub-divided into three clusters according to their 
geographic and/or host origins. Finally, additional analysis revealed 
that the S1 and S2 gene tree topologies differed from that of the S gene 
as a whole (Fig. S1), suggestive of potential recombination events (see 
below). 

2.3. Characterization of the SARSr-Rh-BatCoV genome 

To better characterize these newly-identified SARSr-Rh-BatCoVs 
we recovered the complete genome sequences from each of the lineages 
described. Their genomes varied in size from 29,665 to 29,693 
nucleotides and shared similar organizations with known 
SARS-related CoVs (Fig. 3), including the putative transcription 
regulatory sequence (TRS) motif, 5'-ACGAAC-3', at the 3' end of the 
5' leader sequence and preceding each ORF except ORF7b. 
Additionally, a single long ORF8 was observed in all newly-identified 
SARSr-Rh-BatCoVs. 

The SARSr-Rh-BatCoVs from Longquan (Zhejiang) were closely 
related to the HKU3 strain with nucleotide identities ranging from 
94.7% to 97.0% and amino acid identities of 98.2-99.1% in the RdRp. 
Notably, however, in the nsp2 gene they differed by up to 23% at the 
amino acid level. Additionally, the Longquan-140 virus was markedly 
different from SARSr-CiCoV at the amino acid level in the nsp2 
(25.3%), S (20.3%), ORF3 (17.5%), and ORF8 (62.9%) gene sequences. 
The SARS-related viruses from Jiyuan (Henan) were closely related to 
BtRf-BetaCoV/HeB2013 and BtRf-BetaCoV/SX2013, with 99.0-99.2% 
nucleotide identities. Strikingly, the SARSr-Rh-BatCoVs sampled from 
Anlong (Guizhou) were closely related to SARS-CoVs and SARSr-CiCoVs, 
with > 94% nucleotide identities. 


2.4. Recombination of SARSr-Rh-BatCoVs 

We next conducted an analysis of possible recombination events 
using available genome sequences of SARSr-Rh-BatCoVs and other 
betacoronaviruses, including those discovered in this study and 
SARS-CoVs and SARSr-CiCoVs (Tor2 and SZ3). As noted in Table S2, the 
Longquan-140 virus was markedly different from HKU3 in the nsp2 
gene, suggestive of possible recombination. When the Longquan-140 
sequence was used as the query for a sliding window analysis with HKU3 
as a potential parent, two recombination breakpoints, located in the 
nsp2 (nucleotide 1727) and nsp3 (nucleotide 3055) genes, were 
detected in Longquan-140 with strong p-values (< 10^-25), and 
supported by a similarity plot (Fig. 4). Indeed, there was an abrupt change 
in the topological position of Longquan-140 and HKU3 downstream of 
the first breakpoint (position 1727) and upstream of the second 
breakpoint (position 3055). Accordingly, the Longquan-140 genome 
can be divided into three regions with different evolutionary histories: 
regions A (nucleotides 1-1726), B (1727-3055) and C (3056 to the end 
of the genome sequence). In regions A and C, Longquan-140 was most 
closely related to HKU3 with 96.0% nucleotide similarities, while in 
region B it occupied a basal position, suggesting the recombination event 
originated from a currently un-sampled parental virus. No other 
putative recombination events received significant statistical support. 
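
The region boundaries and the idea behind a similarity plot can be illustrated with a short sketch. This is not the software used in the study, which this section does not name: the helper names, the window and step sizes, the toy sequences and the genome length passed in the usage example are all assumptions made for illustration.

```python
def split_regions(genome_length, b_start=1727, b_end=3055):
    """Return 1-based inclusive coordinates of regions A, B and C."""
    if not 1 < b_start <= b_end < genome_length:
        raise ValueError("breakpoints must lie inside the genome")
    return {
        "A": (1, b_start - 1),
        "B": (b_start, b_end),
        "C": (b_end + 1, genome_length),
    }


def percent_identity(a, b):
    """Percent identical positions between two aligned, equal-length strings.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def sliding_identity(query, reference, window, step):
    """Yield (1-based window start, percent identity) along an alignment."""
    for start in range(0, len(query) - window + 1, step):
        yield start + 1, percent_identity(
            query[start:start + window], reference[start:start + window]
        )


if __name__ == "__main__":
    # 29,665 is the lower bound of the genome sizes reported above, used here
    # only as a placeholder length.
    print(split_regions(29_665))
    # Toy aligned sequences, invented for illustration.
    print(list(sliding_identity("AAAAAAAACCCCCCCC", "AAAAAAAAGGGGGGGG", 8, 8)))
```

A sharp drop in windowed identity against one candidate parent, flanked by high identity on either side, is the pattern that a similarity plot makes visible; statistical support for the breakpoints still requires a dedicated recombination test.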
Fig. 3. Genome organization of coronaviruses. The four CoVs discovered in this study are shown in bold. 
[Fig. 4 panels: (B) Region 1-1726; (D) Region 3056-3' end] 
Fig. 4. Recombination within the genome of the Longquan-140 virus. A sequence similarity plot (A) reveals two recombination breakpoints shown by black arrows with their locations. 
The plot shows genome-scale similarity comparisons of the Longquan-140 (query) against HKU3 and other selected SARS-related CoVs. Phylogenies of regions A, B and C are shown 
below the similarity plot. Numbers (> 70%) above or below branches indicate percentage bootstrap values. 


2.5. Newly-identified alphacoronaviruses in bats 

All remaining bat viruses discovered here belong to the genus 
Alphacoronavirus, and had genome organizations similar to those of 
known alphacoronaviruses (Fig. 3). In the RdRp phylogeny these 
viruses fell into seven distinct lineages within three clusters (Fig. 5). 
The first cluster included the two newly-identified lineages, comprising 
the virus (Neixiang-31) identified in one Myotis davidii bat from 
Neixiang (Henan) and those identified in Hipposideros armiger, M. 
davidii, M. siligorensis, R. macrotis, R. pearsonii and R. pusillus bats from 
Anlong (Guizhou). Further comparison of the CoV replicase domains 
(ADRP, 3CLpro, RdRp, Hel, ExoN, NendoU and O-MT) revealed that 
these newly-identified bat viruses, as well as the bat virus 
BtMr-AlphaCoV/SAX2011 (NC_028811.1), exhibited more than 10% amino 
acid difference in all seven replicase domains (Table S3). These viruses 
also formed a distinct cluster in all gene trees, and clustered clearly 
according to their geographic origins (Fig. 5 and S2). Together, 
these data suggest that these viruses satisfy the species demarcation 
criterion defined by ICTV and should therefore be considered as a novel 
member of the genus Alphacoronavirus, which we denote as Neixiang 
Md bat coronavirus (NMBV). 

Within the second cluster, the viruses discovered in bats from both 
Anlong (Guizhou province) and Neixiang (Henan province) formed a 
distinct lineage that was closely related to Scotophilus bat coronavirus 
512 (Sc-BatCoV 512) (Tang et al., 2006). As these viruses exhibited 
> 10% amino acid difference from members of the genus 
Fig. 5. Phylogenetic analyses of the amino acid sequences of the RdRp including those CoVs obtained here. Numbers (> 70%) above or below branches indicate percentage bootstrap 
values. The trees were mid-point rooted for clarity only. The scale bar represents the number of amino acid substitutions per site. 


Alphacoronavirus in the conserved replicase domains (with the 
exception of the O-MT domain, which differed from that of Sc-BatCoV 512 
by 9.3%; Table S2), it is possible that Anlong-190 and Neixiang-32 
represent a novel CoV species, which we denote as Anlong Ms bat 
coronavirus. However, further studies are clearly needed to determine 
whether this virus indeed represents a novel coronavirus. The 
remaining viruses, which were discovered in bats sampled from Lushi 
and Neixiang and designated as Lushi Ml bat coronaviruses, formed 
another distinct lineage that showed a close evolutionary relationship 
with the bat virus JTAC2 (KU182966). Remarkably, these viruses were 
most closely related to porcine epidemic diarrhea virus (PEDV) in all 
five gene trees (Fig. 5 and S2) and exhibited > 10% difference from 
known members of the genus Alphacoronavirus in six conserved 
replicase domains (Table S3). However, the O-MT gene of Lushi Ml bat 
coronaviruses was most closely related to that of PEDV, with 7.3% 
amino acid difference, indicating that it is a novel variant of PEDV 
(designated as Lushi Ml bat coronavirus). Clearly, these data support 
the evolutionary origin of PEDV from bats (Huang et al., 2013). 

Within the third cluster, the newly-identified bat viruses fell into 
three lineages. The newly-identified virus (Neixiang-64) in Miniopterus 
schreibersii from Neixiang formed a lineage with viruses in M. 
fuliginosus (KJ473799.1), and showed a close relationship with the 
HKU8 virus (Chu et al., 2008). Notably, the viruses from Anlong were 
grouped into two lineages, one of which contained Miniopterus bat 
coronavirus 1A AFCD62 (Chu et al., 2008). In sum, our data reveal a 
high genetic diversity of alphacoronaviruses from bats in China. 

2.6. Phylogenetic relationships between newly-identified viruses and 
their bat hosts 

Comparison of the tree topologies of SARSr-Rh-BatCoVs and their 
bat hosts revealed strong incongruence between the 
SARSr-Rh-BatCoVs and their Rhinolophus bat hosts at the species level, with 
only two likely co-divergence events (Fig. 6A). Similarly, our 
co-phylogenetic analysis of alphacoronaviruses and their bat hosts 
provided evidence for 6-7 co-divergence events, 11-12 host switching 
events, 0 lineage duplication, 1-2 losses and 0 failure to diverge events 
(Fig. 6B). Overall, our co-phylogenetic analysis did not identify 
significant congruence between the phylogenetic trees of viruses and 
their bat hosts (P=0.04). 

[Fig. 6 panels: (A) SARS-related CoVs-Bats; (B) Alpha CoVs-Bats] 
[...] shown in blue while the host phylogeny is shown in black. The host tree was based on mitochondrial cytochrome b gene sequences, and the coronavirus trees were based on the RdRp 
gene. Filled circles at the nodes indicate co-divergence events, empty circles indicate lineage duplication events, arrows indicate host-switching events, while dotted lines indicate loss 
events. 

Clearly, more SARSr-Rh-BatCoVs were identified in R. sinicus and 
R. ferrumequinum bats sampled from a variety of locations (Fig. 2), as 
well as in R. monoceros sampled from Zhejiang province. In addition, 
the same virus could be identified in several species from the same 
geographic locality. For example, SARSr-Rh-BatCoVs were identified in 
four species of Rhinolophus bats from Longquan (Zhejiang) and in two 
bat species from Anlong (Guizhou). Similarly, Neixiang Md bat 
coronavirus (alphacoronavirus) was identified in six bat species 
sampled in Anlong (Guizhou). Hence, it is possible that high contact 
rates among some animal species may enable the acquisition and 
spread of CoVs. 

3. Discussion 

Since the discovery of SARSr-Rh-BatCoVs in bats in 2005 (Lau 
et al., 2005; Li et al., 2005), intense effort has focused on characterizing 
additional CoVs in bats, leading to the identification of diverse 
SARS-like viruses and other bat CoVs worldwide (Drexler et al., 2014; Hu 
et al., 2015; Huang et al., 2016; Woo et al., 2006). More importantly, 
recent studies provide more evidence that human CoVs including 
MERS and SARS viruses may have their ultimate ancestry in bats 
(Annan et al., 2013; Corman et al., 2014, 2015; Ge et al., 2013; Huynh 
et al., 2012; Hu et al., 2015; Ithete et al., 2013; Tao et al., 2017). These 
results highlight the potential significance of detecting and 
characterizing CoVs in bats before their emergence in humans. Herein we 
describe diverse SARSr-Rh-BatCoVs and alphacoronaviruses 
(including novel species) in 13 species of bats sampled from three provinces in 
China, with an overall prevalence of 6.84%. As such, these data reveal 
both the high genetic diversity and the wide geographic distribution of 
CoVs in diverse bats in China. 

Among the known SARSr-Rh-BatCoVs, those discovered by Yang 
et al. (2015) are most closely related to human SARS CoVs in the S 
gene (> 90% amino acid similarity), but distant in the ORF8 gene 
(< 43% amino acid similarity). Interestingly, other CoVs have been 
described that are closely related to SARS-CoVs in the ORF8 gene, but 
relatively distant in the S, nsp2, nsp11, and ORF3 genes (Table S2) 
(Lau et al., 2015; Wu et al., 2016). In this study, with the exception of 
the S and ORF3 genes, the viruses discovered in bats sampled in 
Guizhou province were most closely related to SARS-CoVs, including 
in the ORF8 gene. That all these SARSr-Rh-BatCoVs were sampled in 
southwestern China provides evidence for the origin of SARS-CoVs 
from bats in this region. 

To date, SARSr-Rh-BatCoVs have been detected in bats sampled 
from diverse geographic regions in China as well as in Europe (Drexler 
et al., 2010). Notably, the majority of SARSr-Rh-BatCoVs have been 
discovered in Chinese Rhinolophus bats, while only one study reported 
the detection of SARS-related CoVs in Chaerephon plicata bats 
sampled from Yunnan province (Yang et al., 2013). In our study a 
total of 1067 bats from 21 bat species were captured, with 
SARSr-Rh-BatCoVs identified in 5 species of Rhinolophus bats. Although previous 
studies indicated that R. sinicus and R. ferrumequinum bats exhibited 
a higher prevalence of SARS-related viruses (Lau et al., 2005; Li et al., 
2005; Tang et al., 2006), more SARSr-Rh-BatCoVs were identified in R. 
monoceros bats (Table S1). In sum, these data suggest that 
SARSr-Rh-BatCoVs may be specifically associated with Rhinolophus bats. 

It is commonly stated that all the alphacoronaviruses originate from 
bat species (Woo et al., 2012), as may also be the case for the human 
CoVs NL63 and 229E (Corman et al., 2015; Tao et al., 2017). In 
addition, it is believed that PEDV might have originated from bats, 
although direct evidence is lacking (Huang et al., 2013). In this study, 
Lushi Ml bat coronaviruses discovered in bats (M. leucogaster, M. 
schreibersii) from Lushi and Neixiang exhibited a close relationship to 
PEDV, especially in the O-MT domain (92.7% amino acid identity). In 
addition, all other viruses within the cluster were sampled from bats. 
Hence, these data provide more compelling evidence that PEDV may 
have originated from bats. 

The evolutionary history of RNA viruses is characterized by both 
host switching and co-divergence (Li et al., 2015; Shi et al., 2016), 
which also appears to be true of coronaviruses (Annan et al., 2013; Cui 
et al., 2007; Drexler et al., 2014; Lau et al., 2010, 2012, 2013; Tang 
et al., 2006; Vijaykrishna et al., 2007; Wertheim et al., 2013; Woo et al., 
2009, 2012). As in previous studies (Drexler et al., 2010; Ge 
et al., 2013; Lau et al., 2005; Li et al., 2005; Wu et al., 2016; Yang et al., 
2013), SARSr-Rh-BatCoVs were mainly identified in Rhinolophus bats, 
but not in other bats even when sampled in the same locality (e.g. in 
Guizhou and Henan). In contrast, alphacoronaviruses were mainly 
identified in non-Rhinolophus bats and no alphacoronaviruses were 
discovered in the colony that only comprised Rhinolophus bats. Finally, 
it was noteworthy that SARSr-Rh-BatCoVs and Neixiang Md bat 
coronavirus (alphacoronavirus) were identified in several bat species 
(from up to three families) in the localities of Anlong and Longquan, 
indicative of local inter-species transmission. Hence, these data suggest 
that high contact rates among specific animal species enable the 
acquisition and spread of CoVs. 

4. Material and methods 

4.1. Bat trapping and specimen collection 

Bats were captured alive with mist nets or harp traps at natural 
roosts in caves in Guizhou, Henan, and Zhejiang provinces during 
2012-2015 (Fig. 1). Bat species were initially identified by 
morphological examination and further confirmed by sequence analysis of the 
mt-cyt b gene (Guo et al., 2013). All bats were anesthetized before 
surgery with every effort made to minimize suffering. Tissue samples, 
including those from the rectum, were collected from bats for 
coronavirus detection. 

4.2. DNA and RNA extraction, PCR and sequencing 

We extracted total DNA from bat tissue samples using the DNeasy 
Blood & Tissue kit (QIAGEN, Valencia, USA) according to protocols 
suggested by the manufacturer. Total RNA was extracted from fecal or 
tissue samples using TRIzol (Invitrogen, Carlsbad, CA) according to the 
manufacturer's instructions. Coronavirus RNA was detected by RT- 
PCR as described previously (Lau et al., 2005; Wang et al., 2015). 
Other coronavirus gene sequences were amplified using the primers 
designed based on the conserved regions of known genome sequences. 

RT-PCR amplicons were purified using the QIAquick Gel Extraction 
kit (QIAGEN, Valencia, USA) according to the manufacturer's 
recommendations. Purified DNA < 700 bp was subjected to a direct 
sequencing protocol, while purified DNA > 700 bp was cloned into pMD18-T 
vector (TaKaRa, Dalian, China), which was subsequently transformed 
into JM109-143 competent cells. DNA sequencing was performed with 
Applied Biosystems 377 gene sequencers. 

4.3. Complete genome sequencing 

Four representatives of CoV in bats were selected for full-genome 
sequencing. The initial PCR primer sets were designed from 
each pan-CoV amplicon sequence and/or from a conserved region in 
the CoV RdRp. As required, walking primers were designed for further 
PCR and sequencing. All primer sets used in this study are available 
upon request. 

All sequences generated in this study have been deposited in 
GenBank and assigned accession numbers KF294268-KF294282, 
KF294373-KF294378, KF294381-KF294383, KF294420-KF294457, 
KY770850-KY770860. 
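
For readers who want to retrieve these records programmatically, the ranges above can be expanded into individual accession numbers. The following is a minimal sketch only; the helper name `expand_accession_range` is hypothetical and not part of any GenBank tool, and it assumes each range shares one letter prefix and a fixed-width numeric part.

```python
import re

# Ranges exactly as listed in the text above.
RANGES = [
    "KF294268-KF294282",
    "KF294373-KF294378",
    "KF294381-KF294383",
    "KF294420-KF294457",
    "KY770850-KY770860",
]


def expand_accession_range(spec):
    """Expand 'KF294268-KF294282' into a list of individual accessions.

    Hypothetical helper: assumes a letter prefix followed by digits and
    that both ends of the range share the same prefix.
    """
    first, sep, last = spec.partition("-")
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first = pattern.fullmatch(first.strip())
    m_last = pattern.fullmatch(last.strip()) if sep else m_first
    if m_first is None or m_last is None:
        raise ValueError(f"unrecognised accession range: {spec!r}")
    if m_first.group(1) != m_last.group(1):
        raise ValueError(f"prefix mismatch in range: {spec!r}")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # preserve zero padding, if any
    low, high = int(m_first.group(2)), int(m_last.group(2))
    if high < low:
        raise ValueError(f"descending range: {spec!r}")
    return [f"{prefix}{n:0{width}d}" for n in range(low, high + 1)]


accessions = [acc for spec in RANGES for acc in expand_accession_range(spec)]
```

The resulting list can then be passed to whichever sequence retrieval tool is preferred.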

4.4. Phylogenetic analysis of CoV sequences 

In addition to the sequences recovered here, reference sequences 
that cover the phylogenetic diversity of CoVs were compiled for 
evolutionary analyses. Accordingly, sequence alignments were 
performed using the MAFFT algorithm (Katoh and Standley, 2013). After 
alignment, gaps and ambiguously aligned regions were removed using 
Gblocks (v0.91b) (Talavera and Castresana, 2007). Phylogenetic trees 
were estimated using the maximum likelihood (ML) method 
implemented in PhyML v3.0 (Guindon et al., 2010) with bootstrap support 
values calculated from 1000 replicate trees. The best-fit amino acid 
substitution models were determined using MEGA version 5 (Tamura 
et al., 2011). 
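
The trimming step can be illustrated with a deliberately simplified sketch. It is not the Gblocks algorithm, which also evaluates conservation and the length of contiguous blocks; it only drops alignment columns that contain a gap in any sequence. The function name `strip_gap_columns` and the toy sequences are hypothetical.

```python
def strip_gap_columns(alignment, gap_chars="-?"):
    """Remove every column that contains a gap in at least one sequence.

    `alignment` maps sequence names to aligned strings of equal length.
    Simplified stand-in for alignment trimming, not Gblocks itself.
    """
    if not alignment:
        return {}
    names = list(alignment)
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # Transpose rows into columns, keep the gap-free columns only.
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    # Transpose back into one string per sequence.
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Toy amino acid alignment (illustrative data only).
toy = {
    "virus_A": "MKT-LLVA",
    "virus_B": "MKTALLVA",
    "virus_C": "MR-ALLIA",
}
trimmed = strip_gap_columns(toy)
```

In this toy case the third and fourth columns are discarded because each contains a gap in one sequence.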

4.5. Recombination analysis 

Full-length genomic sequences of the SL-CoVs Longquan-140, 
Anlong-103 and Jiyuan-84 viruses were aligned with those of bat 
SARSr-Rh-BatCoVs and other betacoronaviruses using MEGA 5.0. The 
aligned sequences were preliminarily scanned for recombination events 
using the Recombination Detection Program 4.0 (RDP4), employing 
the RDP, GENECONV, and Bootscan methods (with default 
parameters) (Martin et al., 2015). The potential recombination events 
suggested by RDP (i.e. those with strong P values) were investigated 
further by similarity plots implemented in Simplot 3.5.1 (Lole et al., 
1999). For each of the putative recombinant regions, phylogenies were 
estimated using the maximum likelihood method available in PhyML 
v3.0 (Guindon et al., 2010). 
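
The idea behind a similarity plot can be sketched as a sliding-window identity scan of a query genome against each candidate parental sequence. The code below is an illustration under stated assumptions, not the Simplot implementation: the function name `window_identity`, the window and step sizes, and the toy sequences are all hypothetical, and identity is a plain match fraction with no distance correction.

```python
def window_identity(query, reference, window=200, step=20):
    """Percent identity between two aligned sequences in sliding windows.

    Returns (window midpoint, percent identity) pairs. Positions with a
    gap in either sequence are skipped. Window and step are illustrative.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        matches = 0
        compared = 0
        pairs = zip(query[start:start + window], reference[start:start + window])
        for q, r in pairs:
            if q == "-" or r == "-":
                continue
            compared += 1
            if q == r:
                matches += 1
        identity = 100.0 * matches / compared if compared else float("nan")
        points.append((start + window // 2, identity))
    return points


# Toy mosaic: the query matches parent_a in its first half and parent_b
# in its second half, the pattern a recombinant would be expected to show.
query = "ACGTACGT" + "GGCCGGCC"
parent_a = "ACGTACGT" + "TTAATTAA"
parent_b = "TGCATGCA" + "GGCCGGCC"

scan_a = window_identity(query, parent_a, window=8, step=8)
scan_b = window_identity(query, parent_b, window=8, step=8)
```

A crossover in which reference is most similar to the query along the genome marks a candidate breakpoint, which is then examined with region-specific phylogenies as described above.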

4.6. Analysis of CoV and host co-phylogenies 

Co-phylogenetic analyses of bat hosts and their associated 
coronaviruses were conducted using the heuristic event-based method 
available in the Jane software package (Conow et al., 2010). We 
reconstructed patterns of co-divergence (and hence cross-species 
transmission) using a weight of 0 for co-divergence and a weight of 1 for 
duplication, host switching, lineage loss and failure to diverge. We used 
Jane with 100 generations and a population size of 100 as parameters 
for the genetic algorithm. To test the probability of observing the 
inferred co-divergence number by chance, we employed the random tip 
mapping method with the generation number = 100, population size 
= 100, and sample size = 50. 
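
The cost scheme and the randomization test can be made concrete with a short sketch. It does not reproduce Jane's genetic algorithm for finding reconciliations; it only shows how a total cost follows from event counts under the weights stated above, and how an empirical probability could be derived from randomized tip mappings. The function names, the event counts, and the random-sample costs are hypothetical illustration values, not results from this study.

```python
# Event weights as stated in the text: co-divergence is free,
# every other event costs 1.
EVENT_WEIGHTS = {
    "codivergence": 0,
    "duplication": 1,
    "host_switch": 1,
    "loss": 1,
    "failure_to_diverge": 1,
}


def reconciliation_cost(event_counts, weights=EVENT_WEIGHTS):
    """Total cost of a reconciliation given the count of each event type."""
    unknown = set(event_counts) - set(weights)
    if unknown:
        raise ValueError(f"unknown event types: {sorted(unknown)}")
    return sum(weights[event] * n for event, n in event_counts.items())


def empirical_p_value(observed_cost, random_costs):
    """Fraction of randomized mappings whose cost is as low as or lower
    than the observed cost (assumed interpretation of the test)."""
    if not random_costs:
        raise ValueError("at least one random sample is required")
    as_good = sum(1 for cost in random_costs if cost <= observed_cost)
    return as_good / len(random_costs)


# Hypothetical event counts for one reconciliation.
example_counts = {
    "codivergence": 5,
    "duplication": 1,
    "host_switch": 6,
    "loss": 2,
    "failure_to_diverge": 0,
}
observed = reconciliation_cost(example_counts)

# Hypothetical costs from ten randomized tip mappings.
random_costs = [12, 14, 9, 15, 13, 16, 11, 14, 13, 15]
p_value = empirical_p_value(observed, random_costs)
```

Because co-divergence carries zero weight, the lowest-cost reconciliation is the one that explains the data with the fewest non-co-divergence events, which is why the cost can serve as a measure of departure from strict co-divergence.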